UnicodeEncodeError: 'ascii' codec can't encode character [...]

Posted by user1461135 on Stack Overflow
Published on 2012-07-01T21:45:31Z Indexed on 2012/07/02 21:16 UTC

I have read the Unicode HOWTO in the official docs and a full, very detailed article as well. I still don't understand why I get this error.

Here is what I attempt: I open an XML file that contains characters outside the ASCII range (but inside the allowed XML range). I do that with cfg = codecs.open(filename, encoding='utf-8', mode='r'), which runs fine. Looking at the string with repr() also shows me a unicode string.

Now I go ahead and parse it with parseString(cfg.read().encode('utf-8')). Of course, my XML file starts with this: <?xml version="1.0" encoding="utf-8"?>. Although I suppose it is not relevant, I also declared utf-8 as the source encoding of my Python script, but since I am not writing non-ASCII characters directly in it, this should not matter here. The same goes for the line from __future__ import unicode_literals, which is also right at the beginning.
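The read-and-parse steps described so far can be sketched as follows. This is only a minimal sketch under stated assumptions: io.open stands in for codecs.open (both decode the file to unicode text while reading), and read_first_tag_text, write_sample and the sample document are hypothetical names and data, not taken from the actual program.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile
from xml.dom.minidom import parseString


def write_sample(path):
    """Write a small, hypothetical UTF-8 XML file containing u'\\xfc'."""
    xml_text = (u'<?xml version="1.0" encoding="utf-8"?>\n'
                u'<config><name>M\xfcller</name></config>')
    with io.open(path, mode='w', encoding='utf-8') as handle:
        handle.write(xml_text)


def read_first_tag_text(filename, tag_name):
    """Return the text of the first <tag_name> element as unicode text."""
    # Decode the file to unicode text while reading
    # (io.open is an assumed stand-in for codecs.open).
    with io.open(filename, mode='r', encoding='utf-8') as cfg:
        text = cfg.read()
    # Re-encode to UTF-8 bytes so the input matches the XML declaration.
    dom = parseString(text.encode('utf-8'))
    return dom.getElementsByTagName(tag_name)[0].firstChild.data


if __name__ == '__main__':
    fd, sample_path = tempfile.mkstemp(suffix='.xml')
    os.close(fd)
    try:
        write_sample(sample_path)
        # Compare instead of printing the value, so the demo does not
        # depend on the terminal encoding.
        print(read_first_tag_text(sample_path, 'name') == u'M\xfcller')
    finally:
        os.remove(sample_path)
```

The value that comes back is unicode text, which matches what repr() shows in the description above, so the parsing stage itself does not appear to be where the error originates.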

Next, I pass the resulting object to my own class, where I read tags like this: xmldata.getElementsByTagName(tagName)[0].firstChild.data and assign the result to a variable in my class.

These commands work perfectly (obj is an instance of the class):

for element in obj:
    print element

This command works as well:

print obj.__repr__()

I defined __iter__() to just yield every variable, while __repr__() uses the usual string formatting: "%s" % self.varname

Both commands print perfectly and can output the unicode character. What does not work is this:

print obj

And now I am stuck because this throws the dreaded

UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 47:

So what am I missing? What am I doing wrong? I am looking for a general solution: I always want to handle strings as unicode, to avoid any possible errors and to write a compatible program.
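The error message itself can be reproduced without any XML. The following is a minimal sketch, assuming the failure comes from unicode text being encoded with the ASCII codec somewhere along the way; try_encode and sample are hypothetical names and not part of the program described above.

```python
# -*- coding: utf-8 -*-


def try_encode(text, encoding):
    """Encode unicode text; return (bytes, None) or (None, error)."""
    try:
        return text.encode(encoding), None
    except UnicodeEncodeError as exc:
        return None, exc


# Hypothetical value containing the character from the traceback.
sample = u'M\xfcller'

# ASCII cannot represent u'\xfc', so this attempt fails.
ascii_bytes, ascii_error = try_encode(sample, 'ascii')

# UTF-8 can represent it, so this attempt succeeds.
utf8_bytes, utf8_error = try_encode(sample, 'utf-8')
```

Here ascii_error is a UnicodeEncodeError whose encoding attribute is 'ascii', while the UTF-8 attempt returns bytes. This suggests the thing to look for is a place where unicode text gets converted to a byte string without an explicit encoding.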

Edit: I also defined this:

def __str__(self):
    return self.__repr__()
def __unicode__(self):
    return self.__repr__()
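One likely explanation for the difference: in Python 2, print obj goes through __str__(), which is expected to return a byte string. If it returns unicode text instead, the value is converted with the default codec, which is normally ASCII. A commonly used pattern is to build the text in __unicode__() and encode explicitly in __str__(). The sketch below is only an illustration under stated assumptions: Config is a hypothetical stand-in for the class, and hard-coding UTF-8 is an assumption that may not match every terminal.

```python
# -*- coding: utf-8 -*-
import sys

PY2 = sys.version_info[0] == 2


class Config(object):
    """Hypothetical stand-in for the class from the question."""

    def __init__(self, varname):
        self.varname = varname  # always kept as unicode text

    def __unicode__(self):
        # Single place that builds the text form of the object.
        return u"%s" % self.varname

    if PY2:
        def __str__(self):
            # Python 2: __str__ should return bytes, so encode explicitly
            # (UTF-8 is an assumption, not a universal default).
            return self.__unicode__().encode('utf-8')
    else:
        def __str__(self):
            # Python 3: str is already unicode text, no encoding needed.
            return self.__unicode__()


obj = Config(u'M\xfcller')
text_form = obj.__unicode__()
```

With this layout, unicode(obj) on Python 2 and str(obj) on Python 3 both give back the unicode text, and str(obj) on Python 2 gives UTF-8 bytes instead of triggering the implicit ASCII conversion.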

From documentation I got that this
